Abstract. In computational biology, numerous recent studies have been dedicated to the
analysis of the chromatin structure within the cell by two-dimensional segmentation methods. 
Motivated by this application, we consider the problem of retrieving the diagonal blocks 
in a matrix of observations. The theoretical properties of the least-squares estimators of 
both the boundaries and the number of blocks proposed by Levy-Leduc et al. [2014] are
investigated. More precisely, the contribution of the paper is to establish the consistency
of these estimators. A surprising consequence of our results is that, contrary to the
one-dimensional case, a penalty is not needed for retrieving the true number of diagonal blocks.
Finally, the results are illustrated on synthetic data.


1. Introduction 


Detecting change-points in one-dimensional signals is an important task which arises
in many applications, ranging from EEG (electroencephalography) to speech processing and
network intrusion detection, see Basseville and Nikiforov [1993], Brodsky and Darkhovsky
[2000], Tartakovsky et al. [2014]. The aim of such approaches is to split a signal into several
homogeneous segments according to some quantity. A large literature has been dedicated to
the change-point detection issue for one-dimensional data. This problem also has several
applications when dealing with two-dimensional data. One of the main situations in which
this problem occurs is the detection of chromosomal regions having close spatial location in
the nucleus of a cell. Detecting such regions provides valuable insight into the
influence of chromosomal conformation on cell functioning. More precisely, we will consider
the problem of identifying the so-called cis-interactions between regions of a chromosome.
In this context, n locations spatially ordered along a given chromosome are considered, the
goal being to find clusters of adjacent locations that strongly interact. The elements $Y_{i,j}$ of
a data matrix $\mathcal{Y}$ will then correspond to the interaction level between locations i and j of a
chromosome, which can be measured using the recently developed HiC technologies, see Dixon
et al. [2012]. In this application, the signal (and consequently the data matrix) exhibits a
strong structure: one should observe high signal levels within blocks of locations along the
matrix diagonal, and a signal that is close to some (low) baseline level everywhere else.

As shown in Levy-Leduc et al. [2014], the identification of cis-interactions can be cast
as a segmentation problem, where the goal is to identify diagonal blocks (or regions) with
homogeneous interaction levels. Because these regions are arranged along
the diagonal, the two-dimensional segmentation of the data matrix actually boils down to
a particular one-dimensional segmentation. The dynamic programming algorithm originally
proposed by Bellman [1961] is well known to provide the exact solution of the one-dimensional
segmentation issue in the least-squares sense. Therefore we benefit from the data structure
by avoiding both the computational burden and the approximation errors that come with
heuristic methods used to solve the complex generic problem of two-dimensional segmentation.

While large interaction data matrices can be handled from an algorithmic point of
view, model selection (i.e. selecting the number of blocks K) remains an open question
when dealing with such data. This contrasts with the problem of one-dimensional signal
segmentation, for which the properties of the estimators have been extensively studied, for
instance in Boysen et al. [2009], Lavielle and Moulines [2000], Yao and Au [1989]. In these
approaches, the number of change-points is usually selected using a Schwarz-like penalty
$\lambda_n K$, where $\lambda_n$ is often calibrated on data, as in Lavielle [2005] and Lavielle and Moulines
[2000], or a penalty $K(a + b\log(n/K))$ as in Lebarbier [2005] and Massart [2004], where a
and b are data-driven as well.

The goal of the present paper is to prove the consistency of the estimators of both the
boundaries and the number of blocks obtained by minimizing the (slightly modified)
least-squares criterion proposed by Levy-Leduc et al. [2014]. The proof relies on the strong structure
of the data which is of great help for the model selection issue and for the algorithmic aspects.
More precisely, we will prove that the non-penalized least-squares estimator of the number
of blocks is consistent.

The paper is organized as follows: Section 2 introduces the modeling of the data and the
definition of the least-squares estimators that will be considered throughout the article. The
theoretical properties of the estimators are derived in Section 3 and illustrated on synthetic
data in Section 4. A discussion is given in Section 5. The technical aspects of the proofs are
detailed in Section 6 and in the supplementary material.


2. Statistical Framework 

2.1. Modeling. Let us consider $\mathcal{Y} = (Y_{i,j})_{1 \le i,j \le n}$, a symmetric matrix of random
variables. Because of the symmetry, we shall focus on its upper-triangular part denoted by
$Y = (Y_{i,j})_{1 \le i \le j \le n}$, where the $Y_{i,j}$ will be assumed to be independent and such that

$Y_{i,j} = \mathbb{E}[Y_{i,j}] + \varepsilon_{i,j} = \mu_{i,j} + \varepsilon_{i,j}, \quad 1 \le i \le j \le n.$ (1)

The $\varepsilon_{i,j}$ satisfy the following assumption:

(A1) The $\varepsilon_{i,j}$ are assumed to be centered, i.i.d. and such that there exists a positive constant
$\beta$ such that for all $u \in \mathbb{R}$,

We shall moreover assume that the matrix of means $(\mu_{i,j})_{1 \le i \le j \le n}$ is block diagonal. More
precisely, let $\tau^* = (\tau^*_0, \tau^*_1, \ldots, \tau^*_{K^*})$ be a vector of break fractions such that
$0 = \tau^*_0 < \tau^*_1 < \cdots < \tau^*_{K^*} = 1$. In what follows, the break fractions are fixed quantities: neither their number
nor their positions change when n grows. The parameters $\mu_{i,j}$ are such that

$\mu_{i,j} = \mu^*_k$ if $(i,j) \in D^*_k$, $k = 1, \ldots, K^*$,
$\mu_{i,j} = \mu^*_0$ if $(i,j) \in E^*_0$, (2)

where the (half) diagonal blocks $D^*_k$ ($k = 1, \ldots, K^*$) are defined as follows,

$D^*_k = \{(i,j) : t^*_{k-1} \le i \le j \le t^*_k - 1\}$, (3)

where the $t^*_k = [n\tau^*_k] + 1$ are thus such that $1 = t^*_0 < t^*_1 < \cdots < t^*_{K^*} = n + 1$, $[x]$ denoting the
integer part of x. They stand for the true block boundaries and K* corresponds to the true
number of blocks. In Equation (2), $E^*_0$ corresponds to the set of positions lying outside the
diagonal blocks:

$E^*_0 = \{(i,j) : 1 \le i \le j \le n\} \cap \left( \bigcup_{k=1}^{K^*} D^*_k \right)^c$, (4)

where $A^c$ denotes the complement of set A. An example of such a matrix is displayed in
Figure 1 (left). The following will also be assumed for the true block sizes:

(A2) For all $\ell$, one has

$0 < \Delta^*_\tau \le \tau^*_\ell - \tau^*_{\ell-1} < c$,

where $c \in (0,1)$ is a known constant.

Moreover the $\mu^*_k$ satisfy the following assumption:

(A3) $\lambda^{(0)} = \min_{1 \le k \le K^*} |\mu^*_k - \mu^*_0| > 0$.
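The model of this section can be simulated with a few lines of code. The following is a minimal sketch, assuming Gaussian noise (one admissible choice of centered i.i.d. noise) and using hypothetical helper names `block_boundaries` and `simulate_matrix` that do not come from the paper or from any package.

```python
import numpy as np

def block_boundaries(n, tau):
    """Boundaries t_k = [n * tau_k] + 1 in 1-based indexing (t_0 = 1, t_K = n + 1)."""
    # The tiny offset guards against floating-point products falling just
    # below an integer (for instance 28.999999... instead of 29).
    return [int(np.floor(n * x + 1e-9)) + 1 for x in tau]

def simulate_matrix(n, tau, mu_blocks, mu0=0.0, sigma=1.0, seed=0):
    """Draw a symmetric n x n matrix with diagonal-block means plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = block_boundaries(n, tau)
    mean = np.full((n, n), mu0, dtype=float)
    for k in range(1, len(t)):
        lo, hi = t[k - 1] - 1, t[k] - 1          # 0-based, half-open block range
        mean[lo:hi, lo:hi] = mu_blocks[k - 1]
    # Independent noise on the upper triangle only, then mirror it.
    upper = np.triu(mean + rng.normal(0.0, sigma, size=(n, n)))
    y = upper + np.triu(upper, 1).T
    return y, t
```

For example, `simulate_matrix(16, [0, 0.25, 0.5, 0.75, 1], [1, 1, 1, 1])` produces a 16 x 16 matrix with four diagonal blocks of mean 1 on a background of mean 0.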
2.2. Inference. In this framework, inference consists in estimating both the number of
blocks and the true break fraction vector $\tau^*$ (or equivalently the true boundary vector $t^*$).
One strategy would be to use the following least-squares criterion:

$\hat{t}^{LS}_K \in \operatorname{Argmin}_{t \in \mathcal{A}^{\Delta_n}_{n,K}} \left\{ \sum_{k=1}^{K} \sum_{(i,j) \in D_k} (Y_{i,j} - \bar{Y}_{D_k})^2 + \sum_{(i,j) \in E_0} (Y_{i,j} - \bar{Y}_{E_0})^2 \right\}$, (5)

where $\bar{Y}_D$ is the empirical mean of the $Y_{i,j}$ when the indices (i, j) belong to D, $D_k$ and $E_0$ are
defined as in (3) and (4) except that $t^*$ is replaced by t, and K is the considered number of
segments, K* being unknown in practice. Moreover,

$\mathcal{A}^{\Delta_n}_{n,K} = \{ t = (t_0, \ldots, t_K) : t_0 = 1 < t_1 < \cdots < t_K = n + 1$
and $\forall\, 1 \le k \le K,\ n\Delta_n \le t_k - t_{k-1} \le cn \}$ (6)

is the set of admissible segmentations, where $(\Delta_n)$ denotes a positive sequence.

However, thanks to (A2), one can derive an unbiased estimator of $\mu^*_0$ using the upper-right
triangle part of the matrix $\mathcal{Y}$, denoted $G_{01}$ and defined by

$G_{01} = \{(i,j) : 1 \le i \le n^0,\ (n - n^0 + 1) \le j \le n\}$ with $n^0 = [(1 - c)n]$. (7)

Indeed the intersection between the blocks $D^*_k$ and $G_{01}$ will always be empty. Thus, we can
split $E^*_0$ into two disjoint sets $G^*_{00}$ and $G_{01}$ (see the right part of Figure 1) as follows,

$E^*_0 = G^*_{00} \cup G_{01}$. (8)
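The estimator of the baseline mean based on $G_{01}$ is simply an average over the top-right corner of the matrix. A minimal sketch, assuming a NumPy array `y` holding the observed matrix and a hypothetical function name `estimate_mu0`:

```python
import numpy as np

def estimate_mu0(y, c=0.75):
    """Average of the n0 x n0 top-right corner G_01, with n0 = [(1 - c) n]."""
    n = y.shape[0]
    n0 = int(np.floor((1.0 - c) * n))
    if n0 < 1:
        raise ValueError("c is too close to 1: the corner G_01 is empty")
    # Rows 1..n0 and columns n - n0 + 1..n in the 1-based notation of (7).
    corner = y[:n0, n - n0:]
    return float(corner.mean()), n0
```

Because the corner does not depend on the segmentation, this value can be computed once and reused for every candidate number of blocks.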


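Consequently, we will consider the following slightly modified least-squares criterion:

$\hat{t}_K \in \operatorname{Argmin}_{t \in \mathcal{A}^{\Delta_n}_{n,K}} Q^K_n(t)$, (9)

where

$Q^K_n(t) = \sum_{k=1}^{K} \sum_{(i,j) \in D_k} (Y_{i,j} - \bar{Y}_{D_k})^2 + \sum_{(i,j) \in E_0} (Y_{i,j} - \bar{Y}_{G_{01}})^2$. (10)

Lastly, we will consider the following estimator of K*:

$\hat{K} = \operatorname{Argmin}_{1 \le K \le K_{\max}} Q^K_n(\hat{t}_K)$, (11)

where $\hat{t}_K$ is defined in (9) and $K_{\max}$ is the maximal number of blocks considered.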

Criterion (9) based on $Q^K_n$ has been proposed by Levy-Leduc et al. [2014]. The goal
of our paper is to validate this latter approach theoretically. Note that the main difference
between (5) and (10) is the estimation of $\mu^*_0$, which is independent of the segmentation
since $G_{01}$ is fixed. Hence, $\mu^*_0$ can be estimated prior to the optimization of the criterion (10).
As a consequence, this optimization can be performed by using the dynamic programming
algorithm as explained in Levy-Leduc et al. [2014].

Figure 1. Left: Example of a matrix $(\mu_{i,j})$ with n = 16 and K* = 4. Right:
Illustration of the notations used in the estimation criterion.
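To make the reduction to a one-dimensional dynamic program concrete, here is a minimal sketch. It relies on the identity $\sum_{D}(Y_{i,j} - \bar{Y}_D)^2 - \sum_{D}(Y_{i,j} - m)^2 = -|D|(\bar{Y}_D - m)^2$, so that criterion (10) equals a constant plus a sum of per-block costs. The function name `segment_blocks` and the parameter `min_size` (standing in for the minimal block length $n\Delta_n$) are assumptions made for illustration; this is not the implementation of Levy-Leduc et al. [2014].

```python
import numpy as np

def segment_blocks(y, k_max, c=0.75, min_size=1):
    """Minimise criterion (10) for K = 1..k_max by dynamic programming.

    Returns {K: (criterion value, 1-based boundaries [t_0, ..., t_K])},
    skipping values of K that admit no segmentation under the size limits.
    """
    n = y.shape[0]
    n0 = int(np.floor((1.0 - c) * n))
    if n0 < 1:
        raise ValueError("c is too close to 1: the corner G_01 is empty")
    mu0_hat = y[:n0, n - n0:].mean()        # mean over G_01, fixed beforehand
    max_size = int(np.floor(c * n))         # block length at most c * n

    u = np.triu(y)                          # upper triangle, diagonal included
    p = np.zeros((n + 1, n + 1))            # p[a, b] = u[:a, :b].sum()
    p[1:, 1:] = u.cumsum(axis=0).cumsum(axis=1)

    def cost(a, b):
        # Change in the criterion when rows/columns a..b-1 (0-based) form a
        # block instead of background: -|D| * (mean_D - mu0_hat)^2.
        m = b - a
        size = m * (m + 1) // 2
        s = p[b, b] - p[a, b] - p[b, a] + p[a, a]
        return -size * (s / size - mu0_hat) ** 2

    # Criterion value if every upper-triangular entry were background.
    base = ((y - mu0_hat) ** 2)[np.triu_indices(n)].sum()
    best = np.full((k_max + 1, n + 1), np.inf)
    prev = np.zeros((k_max + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, k_max + 1):
        for b in range(1, n + 1):
            for a in range(max(0, b - max_size), b - min_size + 1):
                v = best[k - 1, a] + cost(a, b)
                if v < best[k, b]:
                    best[k, b], prev[k, b] = v, a

    out = {}
    for k in range(1, k_max + 1):
        if np.isfinite(best[k, n]):
            cuts, b = [n], n
            for kk in range(k, 0, -1):      # backtrack the optimal cut points
                b = int(prev[kk, b])
                cuts.append(b)
            out[k] = (base + best[k, n], [x + 1 for x in reversed(cuts)])
    return out
```

The estimator (11) is then obtained as `min(res, key=lambda k: res[k][0])` for `res = segment_blocks(y, k_max)`. The triple loop costs on the order of $K_{\max} n^2$ operations, which is fine for a sketch but slow in pure Python for large n.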


3. Theoretical results 


The goal of this section is to derive the consistency of $\hat{K}$ and $\hat{t}$. To prove these results, we
shall need the following assumption on $\Delta_n$:

(A4) $\Delta_n \sqrt{n} / (\log n)^{1/4} \to +\infty$ as $n \to +\infty$, and $\Delta_n < \Delta^*_\tau$ for large enough n.


Theorem 1. Let $Y_{i,j}$ be defined by (1). Assume that (A1), (A2), (A3) and (A4) hold. Then
$\hat{K}$ defined in (11) is such that:

$\mathbb{P}(\hat{K} \ne K^*) \to 0$, as $n \to +\infty$. (12)


Remark 1. Observe that, contrary to classical statistical frameworks, $\hat{K}$ is a consistent
estimator of K* even if it is obtained without any penalization.


Remark 2. In Theorem 1, the estimator $\hat{K}$ is defined as the minimizer of $Q^K_n(\hat{t}_K)$ where $\hat{t}_K$
is obtained by minimizing $Q^K_n(t)$ over the set $\mathcal{A}^{\Delta_n}_{n,K}$. If we are only interested in proving that
$\mathbb{P}(\hat{K} < K^*) \to 0$, the minimization can be performed on the set $\mathcal{A}^{1/n}_{n,K}$ instead of $\mathcal{A}^{\Delta_n}_{n,K}$, i.e.
without any constraint on the minimal distance between two consecutive change-points, see
Lemma 1 (i) below and Lemmas 2, 3 and 4 which are given in Section 6.3.

Remark 3. Theorem 1 is valid under (A2), which implies that the number of observations
within each segment increases linearly with n, since $t^*_k = [n\tau^*_k] + 1$. This assumption could be
relaxed by assuming that $\Delta^*_\tau$ is no longer a constant. In that case, we shall need to assume
that A*re^/^/(log tends to infinity, as n tends to infinity. 


Remark 4. The assumption $\Delta_n \gg (\log n)^{1/4}/\sqrt{n}$ of (A4) can be understood in the light of
Lemma 1 (ii) and Equation (17) at the end of the proof of Theorem 1. It is required to ensure
the convergence to zero of the exponential inequalities of the random parts given in Lemmas
2, 3 and 4. This assumption is only required for proving that $\mathbb{P}(\hat{K} > K^*)$ tends to zero
as n tends to infinity. As a consequence, when the number of blocks is known (K = K*),
the break fractions consistency is obtained in our paper when $\Delta_n = 1/n$. Such a choice is
impossible in the one-dimensional segmentation framework of Lavielle and Moulines [2000]
since it is required that $n\Delta_n \to +\infty$ and $\Delta_n \to 0$, as n tends to infinity, in order to obtain
the break fractions consistency when the number of breaks is known.


Remark 5. In practice, c has to be chosen in order to use the top right part of the matrix
of observations to estimate the parameter $\mu^*_0$. This choice can either come from prior
biological knowledge or from a simple visualization of the data. In the case of the analysis of
HiC data, the sizes of the interaction diagonal blocks are expected to be small compared with
the size of the chromosome, i.e. the size of the data matrix. In this context, c = 3/4 can be
safely chosen, as suggested in Levy-Leduc et al. [2014]. If the value of c is misspecified, the
estimator of $\mu^*_0$ is biased. The consistency result of Theorem 1 still holds if (A3) is replaced
by $\min_{1 \le k \le K^*} |\mu^*_k - \mathbb{E}(\bar{Y}_{G_{01}})| > 0$.


Sketch of proof of Theorem 1. In order to prove (12), we shall prove that $\mathbb{P}(\hat{K} < K^*)$ and
$\mathbb{P}(\hat{K} > K^*)$ tend to zero as n tends to infinity. Note that

$\mathbb{P}(\hat{K} < K^*) \le \sum_{K=1}^{K^*-1} \mathbb{P}(\hat{K} = K)$ and $\mathbb{P}(\hat{K} > K^*) \le \sum_{K=K^*+1}^{K_{\max}} \mathbb{P}(\hat{K} = K)$.

Hence, we shall prove that for K < K* and K > K*,

$\mathbb{P}(\hat{K} = K) \to 0$, as $n \to +\infty$.

Observe that by definition of $\hat{K}$ given in (11)

$\mathbb{P}(\hat{K} = K) \le \mathbb{P}\left( \min_{t \in \mathcal{A}^{\Delta_n}_{n,K}} Q^K_n(t) - \min_{t \in \mathcal{A}^{\Delta_n}_{n,K^*}} Q^{K^*}_n(t) \le 0 \right) \le \mathbb{P}\left( \min_{t \in \mathcal{A}^{\Delta_n}_{n,K}} Q^K_n(t) - Q^{K^*}_n(t^*) \le 0 \right)$,

since, for large enough n, $\Delta_n < \Delta^*_\tau$, and hence $t^*$ belongs to $\mathcal{A}^{\Delta_n}_{n,K^*}$. Thus, we shall focus on

$\mathbb{P}\left( \min_{t \in \mathcal{A}^{\Delta_n}_{n,K}} J_n(t) \le 0 \right)$,

where

$J_n(t) = \frac{2}{n(n+1)} \left( Q^K_n(t) - Q^{K^*}_n(t^*) \right)$. (13)

We shall prove in the supplementary material that

$J_n(t) = B_n(t) + V_n(t) + W_n(t) + Z_n(t)$, (14)

where $B_n$, $V_n$, $W_n$ and $Z_n$ are defined by (20), (21), (22), (23) and (24) in Section 6.1. In (14),
$B_n$ corresponds to the deterministic part and the other terms correspond to the random part
of $J_n$.

The remainder of the proof is based on Lemma 1, which is proved in Section 6.2 and which
provides a lower bound for the deterministic part of $J_n$, and on Lemmas 2, 3 and 4 given in
Section 6.3, which provide deviation inequalities for the random terms of $J_n$.


Lemma 1. Let $B_n(t)$ be defined by (20) and (21), then
(i) if K < K*,

$\min_{t \in \mathcal{A}^{1/n}_{n,K}} B_n(t) \ge \frac{(\lambda^{(0)})^2}{64} (\Delta^*_\tau)^4$,


(ii) if K> K*, 


min Bn{t) > =^A„, 


teA: 


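(iii) if K = K*, for all positive $\delta$,

$\min_{t \in \mathcal{A}^{1/n}_{n,K^*},\ \|t - t^*\|_\infty > n\delta} B_n(t) \ge \frac{(\lambda^{(0)})^2}{32} \min(\Delta^*_\tau/2, \delta) (\Delta^*_\tau)^3$, (15)

where $\Delta^*_\tau$ is defined in (A2), $\lambda^{(0)}$ is defined in (A3) and $\mathcal{A}^{\Delta_n}_{n,K}$ is defined in (6),
$\mathcal{A}^{1/n}_{n,K}$ being the particular case with $\Delta_n = 1/n$, and

$\|t - t^*\|_\infty = \max_{0 \le k \le K^*} |t_k - t^*_k|$. (16)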

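Thus,

$\mathbb{P}\left( \min_{t \in \mathcal{A}^{\Delta_n}_{n,K}} J_n(t) \le 0 \right) \le \mathbb{P}\left( \min_{t \in \mathcal{A}^{\Delta_n}_{n,K}} [B_n(t) + V_n(t) + W_n(t) + Z_n(t)] \le 0 \right)$.

The right-hand side (rhs) of the previous inequality is bounded by

$\mathbb{P}\left( -\min_{t} V_n(t) - \min_{t} W_n(t) - \min_{t} Z_n(t) \ge \min_{t} B_n(t) \right)$,

all minima being taken over $t \in \mathcal{A}^{\Delta_n}_{n,K}$. For bounding this term we shall use Lemma 1 (ii). For K > K*, we obtain

$\mathbb{P}\left( \min_{t} J_n(t) \le 0 \right) \le \mathbb{P}\left( -\min_{t} V_n(t) \ge \tfrac{1}{3} \min_{t} B_n(t) \right) + \mathbb{P}\left( -\min_{t} W_n(t) \ge \tfrac{1}{3} \min_{t} B_n(t) \right) + \mathbb{P}\left( -\min_{t} Z_n(t) \ge \tfrac{1}{3} \min_{t} B_n(t) \right)$, (17)

where $\min_t B_n(t)$ is bounded from below by Lemma 1 (ii). By Lemmas 2, 3 and 4 we conclude that

$\mathbb{P}(\hat{K} = K) \to 0$, as $n \to +\infty$,

for K > K*. The case K < K* can be proved by following the same lines. □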


Remark 6. We can observe from Theorem 1 that adding a penalty term is not necessary
for obtaining a consistent estimator of the number of diagonal blocks. This may be surprising
since, in the one-dimensional case, it is proved in Theorem 9 of Lavielle and Moulines
[2000] that a penalty term is required. More precisely, the main difference between our
two-dimensional framework and the one-dimensional case is the behavior of the deterministic part
of our criterion $B_n$: it is lower bounded whatever the value of K (K > K* or K < K*), as
proved in Lemma 1. On the contrary, in the one-dimensional case, a penalty term of the type
$\beta_n K$ is necessary to obtain such a lower bound when K > K*. In the case where K < K*,
a lower bound for $B_n$ is obtained without penalization. For further details, see the proof of
Theorem 9 in Lavielle and Moulines [2000].
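Theorem 2. Assume that the assumptions of Theorem 1 hold. Then, for all $\delta > 0$,

$\mathbb{P}\left( \|t^* - \hat{t}_{\hat{K}}\|_{\mathcal{H}} > n\delta \right) \to 0$, as $n \to +\infty$, (18)

where $\hat{t}_{\hat{K}}$ is defined in (9) and (11) and $\|\cdot\|_{\mathcal{H}}$ denotes the Hausdorff distance defined by

$\|t^* - \hat{t}_{\hat{K}}\|_{\mathcal{H}} = \max\left( \max_{0 \le k \le K^*} \min_{0 \le \ell \le \hat{K}} |t^*_k - \hat{t}_\ell|,\ \max_{0 \le \ell \le \hat{K}} \min_{0 \le k \le K^*} |t^*_k - \hat{t}_\ell| \right)$.

Observe that (18) can be rewritten as $\mathbb{P}(\|\tau^* - \hat{\tau}_{\hat{K}}\|_{\mathcal{H}} > \delta) \to 0$, where $\hat{\tau}_{\hat{K}} = \hat{t}_{\hat{K}}/n$.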


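Sketch of proof of Theorem 2. Observe that

$\mathbb{P}\left( \|t^* - \hat{t}_{\hat{K}}\|_{\mathcal{H}} > n\delta \right) = \mathbb{P}\left( \{\|t^* - \hat{t}_{\hat{K}}\|_{\mathcal{H}} > n\delta\} \cap \{\hat{K} \ne K^*\} \right) + \mathbb{P}\left( \{\|t^* - \hat{t}_{\hat{K}}\|_{\mathcal{H}} > n\delta\} \cap \{\hat{K} = K^*\} \right)$
$\le \mathbb{P}(\hat{K} \ne K^*) + \mathbb{P}\left( \|\hat{t}_{K^*} - t^*\|_\infty > n\delta \right)$,

where $\|\hat{t}_{K^*} - t^*\|_\infty$ is defined in (16), since $\|\hat{t}_{\hat{K}} - t^*\|_{\mathcal{H}} = \|\hat{t}_{K^*} - t^*\|_\infty$ when $\hat{K} = K^*$. By
Theorem 1, proving (18) amounts to proving that

$\mathbb{P}\left( \|\hat{t}_{K^*} - t^*\|_\infty > n\delta \right) \to 0$, as $n \to +\infty$.

Observe that

$\mathbb{P}\left( \max_{0 \le k \le K^*} |\hat{t}_k - t^*_k| > n\delta \right) \le \mathbb{P}\left( \min_{t \in \mathcal{A}^{\Delta_n}_{n,K^*},\ \|t - t^*\|_\infty > n\delta} J_n(t) \le 0 \right)$.

Using the same arguments as those used in the proof of Theorem 1, the proof follows from
the decomposition of $J_n$ given by (14), the lower bound (15) of Lemma 1 and the deviation
inequalities for the random terms given by Lemmas 2, 3 and 4. □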


4. Numerical experiments 


The goal of this section is to illustrate the theoretical results obtained in Section 3. For an
application of our method to real data, we refer the reader to Levy-Leduc et al. [2014].


4.1. Simulation framework. We generated Gaussian diagonal block matrices according to
Model (1) with $\mu^*_k = 1$ for the K* = 5 diagonal blocks and $\mu^*_0 = 0$ for different values of
n ($n \in \{500, 1500\}$). The change-point locations are $(\tau^*_0, \ldots, \tau^*_5) = (0, 0.07, 0.2, 0.4, 0.67, 1)$,
hence $\Delta^*_\tau = 0.07$. We shall use different values for the standard deviation $\sigma$ of the $\varepsilon_{i,j}$:
$\sigma \in \{1, \ldots, 10\}$. For each case, 500 matrices were simulated and the procedure was tested.
Examples of such matrices are displayed in Figure 2 for different values of $\sigma$.

The results that are presented below have been obtained by using the R package HiCseg,
which is available on CRAN. In this package, the values of $\Delta_n$ and c are fixed and equal
to 2/n and 3/4, respectively.
Figure 2. Examples of simulated matrices following Model (1) with
$(\tau^*_0, \ldots, \tau^*_5) = (0, 0.07, 0.2, 0.4, 0.67, 1)$ and n = 500 for two values of $\sigma$: $\sigma = 1$
(left) and $\sigma = 4$ (right).


4.2. Statistical performance. 


4.2.1. Performance of the statistical procedure. We first consider the problem of estimating
the true number of blocks K*, and provide some insight about the consistency of our procedure
without penalty, as outlined in the remarks following Theorem 1. The median, 1st and 3rd quartiles of the estimated
number of change-points are displayed in Figure 3 for n in {500, 1500} and for different values
of $\sigma$.

On the one hand, we observe that for high signal to noise ratios, the true value of K*
is retrieved by our procedure. On the other hand, when the signal to noise ratio becomes
very low, K* is not properly estimated. In this situation, K* is overestimated, which is in
accordance with what occurs in the one-dimensional case where a non-penalized procedure
would result in a systematic overestimation of K*. However, when n increases, the value of
$\sigma$ from which this overestimation occurs is unsurprisingly larger.


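To illustrate the performance of our procedure in terms of the estimation of change-point
locations, Figure 4 displays the boxplots of the two parts of the Hausdorff distance defined by:

$\|t^* - \hat{t}_{\hat{K}}\|_{\mathcal{H}_1} = \max_{0 \le k \le K^*} \min_{0 \le \ell \le \hat{K}} |t^*_k - \hat{t}_\ell|$ and $\|t^* - \hat{t}_{\hat{K}}\|_{\mathcal{H}_2} = \max_{0 \le \ell \le \hat{K}} \min_{0 \le k \le K^*} |t^*_k - \hat{t}_\ell|$. (19)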
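The two parts of the Hausdorff distance are straightforward to compute from two lists of boundaries. A minimal sketch follows; the function name `hausdorff_parts` is hypothetical, and the inputs are assumed to be plain sequences of integer boundaries.

```python
def hausdorff_parts(t_true, t_hat):
    """Return the two one-sided parts of the Hausdorff distance in (19).

    The first part is small when every true boundary has an estimated
    boundary nearby; the second is small when every estimated boundary
    lies close to a true one (i.e. there are no spurious boundaries).
    """
    h1 = max(min(abs(a - b) for b in t_hat) for a in t_true)
    h2 = max(min(abs(a - b) for a in t_true) for b in t_hat)
    return h1, h2
```

The Hausdorff distance itself is `max(hausdorff_parts(t_true, t_hat))`. With one spurious boundary added to an otherwise exact estimate, the first part is 0 and only the second part is positive.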
Figure 3. Top: Median (plain), 1st and 3rd quartiles (dotted line) of the
estimations of K* = 5 as a function of the standard deviation $\sigma$ for n = 500
(left) and n = 1500 (right). The values of $\hat{K}$ at each simulation are displayed
with light grey dots. The dashed line corresponds to the true value of K*.
Bottom: Same plots with the x-axis values restricted to {1,..., 5}.

We observe from Figure 4 that when K* is overestimated, the true change-points are
recovered ($\|\cdot\|_{\mathcal{H}_1}$ is close to 0), the other estimated change-points being spurious ones ($\|\cdot\|_{\mathcal{H}_2}$ is
large). In accordance with the theoretical results of Section 3, this phenomenon is less visible when n becomes large.

4.2.2. Effect of a poor estimation of $\mu^*_0$. We study the behavior of our segmentation procedure
when $\mu^*_0$ is poorly estimated, which may occur, for instance, when the constant c appearing
in (7) is too small. To this end, we generated data in which the mean of the $n^0 \times n^0$ top right
part of the observation matrix is modified, where $n^0$ is defined in (7). More precisely, the
mean of this part is equal to $\mu^*_0 + \omega$, where $\omega \in \{0.2, 0.4, 0.6, 0.8\}$. The results are displayed
in Figure 5. We can see from this figure that when the value of $\mu^*_0 + \omega$ is close to the values of
the means of the diagonal blocks, our procedure tends to overestimate K*. This phenomenon
is less visible when n is large.

Figure 4. Boxplots of the two parts of the Hausdorff distance: $\|\cdot\|_{\mathcal{H}_1}$ (top)
and $\|\cdot\|_{\mathcal{H}_2}$ (bottom) for n = 500 (left) and n = 1500 (right). For each case, the
boxplots are displayed as a function of $\sigma$.


5. Discussion 


In this paper, we established that the (slightly modified) least-squares estimators for the
number of blocks and their boundaries in a block diagonal matrix are consistent. Note that
the obtained results are non-standard in the sense that we proved that penalizing the
least-squares criterion is not required to obtain a consistent estimator of the number of diagonal
blocks. This has to be contrasted with the one-dimensional case, where it is well known that
a penalization is required to ensure consistency, see for instance Lavielle and Moulines [2000].
More precisely, a close look at the proof of Theorem 9 in Lavielle and Moulines [2000] shows
that a penalty is required to discard models such that K > K*. This comes from the fact
that in the one-dimensional setting when K > K* the deterministic part of $J_n$ vanishes for
all segmentations t satisfying $\|t^* - t\|_{\mathcal{H}_1} = 0$ (i.e. for all segmentations t nested in the true
segmentation $t^*$). This bias term being null, a penalty term has to be added to the criterion
to compensate for the stochastic deviations of the random terms in $J_n$. In the two-dimensional
setting, the deterministic part does not vanish when K > K*, as proved in Lemma 1,
ensuring consistency.

Figure 5. Boxplots of $\hat{K}$ for $\sigma = 1$ (top) and $\sigma = 4$ (bottom) for n = 500
(left) and n = 1500 (right). For each case, the boxplots are displayed as a
function of $\omega$.

The framework that we have chosen for proving our results consists in assuming that the
observations are independent and that the size of the observation matrix is large (asymptotic
framework), which is adapted to the analysis of HiC experiments. From a practical point of
view, the independence assumption is not always satisfied, for instance when the observation
matrix is a correlation or a similarity matrix, see for example Dehman et al. [2015], Ioanna
Delatola et al. [2015]. Hence, relaxing the independence assumption to retrieve diagonal block
boundaries in such cases would be a natural extension of this paper. Moreover, it could be
interesting to see if a penalty term needs to be added to our criterion in order to retrieve
properly the break fractions in a non-asymptotic setting. This will be the subject of future
work.


6. Proofs 

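6.1. Definition of $B_n$, $V_n$, $W_n$ and $Z_n$. We define hereafter $B_n$, $V_n$, $W_n$ and $Z_n$, which
appear in (14), by:

$B_n(t) = B^D_n(t) + B^0_n(t)$, $V_n(t) = V^D_n(t) + V^0_n(t)$, $W_n(t) = W^D_n(t) + W^0_n(t)$, (20)

and

$B^D_n(t) = \frac{2}{n(n+1)} \sum_{k=1}^{K} \sum_{(i,j) \in D_k} (\mathbb{E}[Y_{i,j}] - \mathbb{E}[\bar{Y}_{D_k}])^2$, $B^0_n(t) = \frac{2}{n(n+1)} \sum_{(i,j) \in G_{00}} (\mathbb{E}[Y_{i,j}] - \mathbb{E}[\bar{Y}_{G_{01}}])^2$, (21)

$V^D_n(t) = \frac{2}{n(n+1)} \left[ \sum_{k=1}^{K^*} \frac{\left( \sum_{(i,j) \in D^*_k} \varepsilon_{i,j} \right)^2}{|D^*_k|} - \sum_{k=1}^{K} \frac{\left( \sum_{(i,j) \in D_k} \varepsilon_{i,j} \right)^2}{|D_k|} \right]$, $V^0_n(t) = \frac{2}{n(n+1) |G_{01}|^2} \left( \sum_{(i,j) \in G_{01}} \varepsilon_{i,j} \right)^2 (|G_{00}| - |G^*_{00}|)$, (22)

$W^D_n(t) = \frac{4}{n(n+1)} \left[ \sum_{k=1}^{K^*} \mathbb{E}[\bar{Y}_{D^*_k}] \sum_{(i,j) \in D^*_k} \varepsilon_{i,j} - \sum_{k=1}^{K} \mathbb{E}[\bar{Y}_{D_k}] \sum_{(i,j) \in D_k} \varepsilon_{i,j} \right]$, $W^0_n(t) = \frac{4}{n(n+1)} \mu^*_0 \left[ \sum_{(i,j) \in G^*_{00}} \varepsilon_{i,j} - \sum_{(i,j) \in G_{00}} \varepsilon_{i,j} \right]$, (23)

$Z_n(t) = \frac{4}{n(n+1) |G_{01}|} \left( \sum_{(i,j) \in G_{01}} \varepsilon_{i,j} \right) \left[ \sum_{(i,j) \in G^*_{00}} \varepsilon_{i,j} - \sum_{(i,j) \in G_{00}} \varepsilon_{i,j} - \sum_{(i,j) \in G_{00}} (\mathbb{E}[Y_{i,j}] - \mu^*_0) \right]$. (24)

In the equations, $G^*_{00}$ and $G_{01}$ are defined in (8) and (7), and $G_{00}$ has the same definition as
$G^*_{00}$ except that $t^*$ is replaced by t.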
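6.2. Proof of Lemma 1. We shall first rewrite $B^D_n$ and $B^0_n$ defined by (21). Let us first
denote by

$n_{k,\ell} = |D_k \cap D^*_\ell|$, (25)

the number of observations that belong to the intersection of the two blocks $D_k$ and $D^*_\ell$ (with
the convention that $D_0 = G_{00}$ and $D^*_0 = G^*_{00}$) and

$n_k = \sum_{\ell=0}^{K^*} n_{k,\ell}$ and $n^*_\ell = \sum_{k=0}^{K} n_{k,\ell}$.

Since $\mathbb{E}[\bar{Y}_{G_{01}}] = \mu^*_0$, $G_{00} \subset \bigcup_{\ell=0}^{K^*} D^*_\ell$ and $\mathbb{E}[Y_{i,j}] = \mu^*_\ell$ for all $(i,j) \in D^*_\ell$, we obtain

$B^0_n(t) = \frac{2}{n(n+1)} \sum_{(i,j) \in G_{00}} (\mathbb{E}[Y_{i,j}] - \mu^*_0)^2 = \frac{2}{n(n+1)} \sum_{\ell=0}^{K^*} \sum_{(i,j) \in G_{00} \cap D^*_\ell} (\mu^*_\ell - \mu^*_0)^2 = \frac{2}{n(n+1)} \sum_{\ell=1}^{K^*} n_{0,\ell} (\mu^*_\ell - \mu^*_0)^2$. (26)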


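Since $|D_k| = \sum_{\ell=0}^{K^*} |D_k \cap D^*_\ell| = \sum_{\ell=0}^{K^*} n_{k,\ell} = n_k$,

$\mathbb{E}[\bar{Y}_{D_k}] = \frac{1}{|D_k|} \sum_{\ell=0}^{K^*} \sum_{(i,j) \in D_k \cap D^*_\ell} \mathbb{E}[Y_{i,j}] = \frac{1}{n_k} \sum_{\ell'=0}^{K^*} n_{k,\ell'} \mu^*_{\ell'}$, (27)

where we use that, for all $k \in \{1, \ldots, K\}$, $D_k \subset \bigcup_{\ell=0}^{K^*} D^*_\ell$. Thus,

$\sum_{(i,j) \in D_k} (\mathbb{E}[Y_{i,j}] - \mathbb{E}[\bar{Y}_{D_k}])^2 = \frac{1}{n_k^2} \sum_{\ell=0}^{K^*} \sum_{(i,j) \in D_k \cap D^*_\ell} \left( n_k \mathbb{E}[Y_{i,j}] - \sum_{\ell'=0}^{K^*} n_{k,\ell'} \mu^*_{\ell'} \right)^2$
$= \frac{1}{n_k^2} \sum_{\ell=0}^{K^*} n_{k,\ell} \left( \sum_{\ell'=0}^{K^*} n_{k,\ell'} (\mu^*_\ell - \mu^*_{\ell'}) \right)^2$
$= \frac{1}{n_k^2} \sum_{\ell=0}^{K^*} \sum_{\ell_1=0}^{K^*} \sum_{\ell_2=0}^{K^*} n_{k,\ell} n_{k,\ell_1} n_{k,\ell_2} (\mu^*_\ell - \mu^*_{\ell_1})(\mu^*_\ell - \mu^*_{\ell_2})$
$= \frac{1}{n_k} \sum_{\ell=0}^{K^*} \sum_{\ell_1=0}^{K^*} n_{k,\ell} n_{k,\ell_1} (\mu^*_\ell - \mu^*_{\ell_1})^2 - \frac{1}{2 n_k} \sum_{\ell_1=0}^{K^*} \sum_{\ell_2=0}^{K^*} n_{k,\ell_1} n_{k,\ell_2} (\mu^*_{\ell_1} - \mu^*_{\ell_2})^2$
$= \frac{1}{2 n_k} \sum_{\ell=0}^{K^*} \sum_{\ell'=0}^{K^*} n_{k,\ell} n_{k,\ell'} (\mu^*_\ell - \mu^*_{\ell'})^2$,

where the fourth line uses $(\mu^*_\ell - \mu^*_{\ell_1})(\mu^*_\ell - \mu^*_{\ell_2}) = \frac{1}{2} \left[ (\mu^*_\ell - \mu^*_{\ell_1})^2 + (\mu^*_\ell - \mu^*_{\ell_2})^2 - (\mu^*_{\ell_1} - \mu^*_{\ell_2})^2 \right]$.

Hence,

$B^D_n(t) = \frac{1}{n(n+1)} \sum_{k=1}^{K} \frac{1}{n_k} \sum_{\ell=0}^{K^*} \sum_{\ell'=0}^{K^*} n_{k,\ell} n_{k,\ell'} (\mu^*_\ell - \mu^*_{\ell'})^2$. (28)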
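As a sanity check of the pairwise-difference form, here is a worked special case that is not part of the original argument. Suppose an estimated block $D_k$ overlaps only two true regions, with $a$ entries of mean $\mu^*_0$ and $b$ entries of mean $\mu^*_1$ (the symbols $a$ and $b$ are introduced here for illustration only). Both ways of computing the within-block sum of squares then agree:

```latex
% Direct computation: the block mean is (a\mu^*_0 + b\mu^*_1)/(a+b).
\sum_{(i,j) \in D_k} \bigl( \mathbb{E}[Y_{i,j}] - \mathbb{E}[\bar{Y}_{D_k}] \bigr)^2
  = a \Bigl( \tfrac{b}{a+b} \Bigr)^2 (\mu^*_1 - \mu^*_0)^2
  + b \Bigl( \tfrac{a}{a+b} \Bigr)^2 (\mu^*_1 - \mu^*_0)^2
  = \frac{ab}{a+b} (\mu^*_1 - \mu^*_0)^2 .

% Pairwise form with n_k = a + b: the ordered pairs (0,1) and (1,0)
% each contribute ab(\mu^*_1 - \mu^*_0)^2, the pairs (0,0) and (1,1) nothing.
\frac{1}{2 n_k} \sum_{\ell} \sum_{\ell'} n_{k,\ell} n_{k,\ell'} (\mu^*_\ell - \mu^*_{\ell'})^2
  = \frac{1}{2(a+b)} \cdot 2ab (\mu^*_1 - \mu^*_0)^2
  = \frac{ab}{a+b} (\mu^*_1 - \mu^*_0)^2 .
```

In particular the contribution is positive as soon as a block mixes entries with two different means, which is the mechanism behind the lower bounds of Lemma 1.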


Figure 6. Left: K < K*. Right: K > K*.


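6.2.1. Case K < K* and $t \in \mathcal{A}^{1/n}_{n,K}$. Observe that $B_n(t) \ge B^D_n(t)$. Since K < K*, $t_K - t^*_K = t^*_{K^*} - t^*_K$.
Hence, $\{k : t_k - t^*_k \ge n\Delta^*_\tau/2\} \ne \emptyset$. Let $\ell = \min\{k : t_k - t^*_k \ge n\Delta^*_\tau/2\}$, then
$\ell \ge 1$ and

$t_{\ell-1} \le t^*_\ell - n\Delta^*_\tau/2 \le t^*_\ell + n\Delta^*_\tau/2 \le t_\ell$.

By definition of $\Delta^*_\tau$,

$n_{\ell,\ell} = |D_\ell \cap D^*_\ell| \ge \min\left\{ (t^*_\ell - t_{\ell-1})(t^*_\ell - t_{\ell-1} + 1)/2,\ (t^*_\ell - t^*_{\ell-1})(t^*_\ell - t^*_{\ell-1} + 1)/2 \right\} \ge (n\Delta^*_\tau)^2/8$, (29)

and

$n_{\ell,0} \ge \min\left\{ (t_\ell - t^*_\ell)(t^*_\ell - t_{\ell-1}),\ (t^*_{\ell+1} - t^*_\ell)(t^*_\ell - t_{\ell-1}) \right\} \ge (n\Delta^*_\tau)^2/4$. (30)

Thus, using (29) and (30), we obtain

$B_n(t) \ge \frac{1}{n(n+1) n_\ell} n_{\ell,\ell} n_{\ell,0} (\mu^*_0 - \mu^*_\ell)^2 \ge \frac{(n\Delta^*_\tau)^4 (\lambda^{(0)})^2}{32\, n(n+1) n_\ell} \ge \frac{(\Delta^*_\tau)^4 (\lambda^{(0)})^2}{64}$,

since $n_\ell \le n(n+1)/2$.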


6.2.2. Case K > K* and $t \in \mathcal{A}^{\Delta_n}_{n,K}$. We have

B„(t) > (t) > -r^T^rio^k “ /^o)^ > 

n{n + 1) n{n + 1) 

for any k G {0, ..., K*}. Since t G there exists i G {1, ... ,K — 1} such that for all 

k€{0,...,K*} 

, .i. , nA„ 

(otherwise, it will imply that K < K*). Moreover, let us choose k such that + nAnl‘1 < 
ti < — nAnl2 then 

no,k > {ti - tl_,) (4 - U) > . 

This leads to 

B„(t) > 

6.2.3. Case K = K* and $t \in \mathcal{A}^{1/n}_{n,K^*}$, $\|t - t^*\|_\infty > n\delta$. We have

Bn{t) > / ^1 (31) 

n(n + Ij ni 

for every I G {1, ..., iC} and every C G {1, ..., K*}. Then, we shall consider two cases: i) 
l|t - tiloo < ^ and ii) ||t -1*||^ > 

We shall assume that tk — 4 = ||t — > 0. 


There are two possible configurations (see Figure [^. If 4-i ^ < 4 ^ then, by 

definition of A*, we obtain 




/ 

(4 ~ 4-i) ~ {tk-i — 4-i) I (4 ~ + 1 ) 


_ 4i-tk-i)i4-tk-i + i) ^ 

^k,k — ^ 


Otherwise, if tfc-i < 4-i ^ 4 ^ we obtain 

(4 - 4-i)i4 - 4-1 + 1 ) > (nA*)2 ^ (nA*)2 


> 


(nA; 


-k\2 


(32) 




> 


2 


2 


(33) 
















Figure 7. K = K* and $\|t - t^*\|_\infty < n\Delta^*_\tau/2$. Left: $t^*_{k-1} < t_{k-1} < t^*_k < t_k$.
Right: $t_{k-1} < t^*_{k-1} < t^*_k < t_k$.


Then, by using the above decomposition of — tk-i), we obtain 

'kl-kfi ^ (tk tk—l)(tk tk), 

( 

> 


(tl - tl.i) - (tk -1 - tLi) I (tk - tl) > -fn^5. 


\ >nA* 




Af 


=||t—t* 


By choosing (£ = k,£' = k) in {31), and by using (32), ( p^ and (34), we obtain 

B„(() > 

n(n + 1) rifc 8 2 32 

• • i mj. j.ik-11 \ nA* 

n) ||t — t Iloo > — 2 ~' 


(34) 


Since K = K*, there exists k such that t\ — tk > and tk+i — tk> (otherwise, this 
would imply that K > K*). As above, there are two possible cases, either tk < % < < 

tk+i or tk <tl< 4+1 < tlj^-^ (see Figure]^. 

If tk < t^ < t^+i < tfc+i) we obtain, by definition of A*, 

(tUi-tl)(tUi-tl + l) ^ (nA*)2 


^fc+i,fc+i 


2 


2 


(35) 
Figure 8. K = K* and $\|t - t^*\|_\infty \ge n\Delta^*_\tau/2$. Left: $t_k < t^*_k < t^*_{k+1} < t_{k+1}$.
Right: $t_k < t^*_k < t_{k+1} < t^*_{k+1}$.


and 


R-fc+ijO ^ (^fe+i tk){tk 4) ^ (^^t) 


(nA*) 


If tk <tl < 4+1 < we obtain 

(4+1 - tl)itk+i -tl + 1) 




2 V 2 


(36) 


(37) 


and 


^fc+1,0 ^ {tk+l tk){tk tk) A 


(nA* 


(38) 


By choosing (7 = 7' = A: + 1) in ( |M| ), and by using (35), (36), (37) and (38), we obtain 

B„(t) > 

6.3. Deviation inequalities. 

Lemma 2. For all a > 0, 

\ n(n + l)c>: _ jGQjJa 

— min Vn{t) >a < n(n + l)e + 2e , 


tsA 


1/n 
where $V_n$ is defined by (20) and (22) and $\mathcal{A}^{1/n}_{n,K}$ is defined in (6) with $\Delta_n = 1/n$. Moreover, if


a = an is such that ann^/log(n) —)• cx) and an\GQfi —)■ oo, as n tends to infinity, then 


— min Vn{t) > an \ —S' 0, as n ^ +cx). 


teA 


1 / 
n,K 


The proof is given in the supplementary material. 


Lemma 3. Let Wn be defined by (20) and (23), then there exists Ci > 0 such that for all 
a > 0 : 


( — min ITn(t) > a 1 

< exp 

o?n{n F 1 ) 

V / 


[ 128,d(R: + l)^(R:* + l)^A J 


where A = sup |/r^ — and A^J^^ is defined in (6) with A„ = 1/n. Moreover, if a = an is 
such that a^n^/log(n) —)■ oo, as n tends to infinity, then 


— min Wn{t) > an \ —t 0, as n ^ oo. 


teA, 


1 / 


The proof is given in the supplementary material. 
Lemma 4. For all a > 0 and 7 > 0, 


— mm 


Zn(t) > a I < 2e ' + 2e 


iGoj>2„2 


teA: 


1/1 

n,K 


where Zn is defined by (24), A^n^ defined in (6) with A„ = 1/n and A = sup|//^ — 

' ' ’ ' ' k^e 


Moreover, if a = an is such that a„n"^/log(n) —)• 00 and a„n'^|Go,i| —S' 00 , as n tends to 
infinity, then 


— min Zn{t) > an \ —S' 0, as n ^ 00 . 


teA, 


1 / 


The proof is given in the supplementary material. 
Supplementary material


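Proof of Equation (14). The goal of this section is to prove that $J_n$ defined in (13) can be
rewritten as in (14). By definition of $Q^K_n$ given in (10),

$J_n(t) = J^D_n(t) + J^0_n(t)$, (39)

where

$J^D_n(t) = \frac{2}{n(n+1)} \left[ \sum_{k=1}^{K} \sum_{(i,j) \in D_k} (Y_{i,j} - \bar{Y}_{D_k})^2 - \sum_{k=1}^{K^*} \sum_{(i,j) \in D^*_k} (Y_{i,j} - \bar{Y}_{D^*_k})^2 \right]$

and

$J^0_n(t) = \frac{2}{n(n+1)} \left[ \sum_{(i,j) \in G_{00}} (Y_{i,j} - \bar{Y}_{G_{01}})^2 - \sum_{(i,j) \in G^*_{00}} (Y_{i,j} - \bar{Y}_{G_{01}})^2 \right]$.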


Using (1) and Yd^ = E [YdJ + \Dk\ ^ where \A denotes the cardinality of 

the set A, we obtain 


E (y‘j-yDY= E + 


(hi)eT>fc 

(hi)eT>fc 


{i,j)^Dk 


-2 E 
+ E 

(*j)eT>fe 


E [y,j] E [Yn,] + [Yd,] + E [Y,,] — ( ^ 




E [Yd,]"+ 2E [Yd: ^ 


\Dk\ 


Y1 I + 

Ai'J')eD, 


1 


E 

{i',j')eDk 


\Dk 

2 - 




\Dk\^ 




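By gathering the deterministic terms and the terms linked to the noise, we obtain

$\sum_{(i,j) \in D_k} (Y_{i,j} - \bar{Y}_{D_k})^2 = \sum_{(i,j) \in D_k} (\mathbb{E}[Y_{i,j}] - \mathbb{E}[\bar{Y}_{D_k}])^2 - \frac{1}{|D_k|} \left( \sum_{(i',j') \in D_k} \varepsilon_{i',j'} \right)^2 - 2\, \mathbb{E}[\bar{Y}_{D_k}] \sum_{(i,j) \in D_k} \varepsilon_{i,j} + \sum_{(i,j) \in D_k} (2\, \mathbb{E}[Y_{i,j}]\, \varepsilon_{i,j} + \varepsilon^2_{i,j})$. (40)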
Thus, for $J^D_n$ defined in (39), we obtain


4^(t) = 

n[n + 1 ) 


K K* 

>2 


E E -EE (eKjI-e 

*^=1 k=l {i,j)&Dl 


Ydi 


=0 




n(n -|- 1 ) 
2 

n(n -|- 1 ) 
2 


k=i ' Vh'dOeDfe I ii,j)eDk k=i ' k\ \ (i/j/)eD* 




E 

{hi)&Di 


E^ E -E^ _ 

k=i ' ''' \(i',i')eDfe / k=i ' k\ \ (i/j/)eD 


2n 


E 




K* 


E E (2K [E,,] 6„- + 4,) - E E (2E [Yij] £ij £^j) 


nin + 1 ) . 

^ ^ ' fc=i (i,i)eDfe fc=i (ij)eDj 

= B^(t) + fTi'(t) + E„^(t) 

K K* 

'I I « 

-h 


n(n -|- 1 ) 


E E (2E +4A - E E (2E [Tij] Eij -|- e^j) 


^A:=l {iJ)&Dk k=l 

since E[yjj] = E[Yd*], for all {i,j) G D^.. Using (1^, we obtain 


(41) 


^S(t) = 


E (Yu-Yao,)"- E (Yj-Ygo,? 


n(n -I- 1) 


Using 0 andUcoi =E[>"goi] +|Goi|-'E(*' j')eGoi obtain 


E {Y.j-YG„y= E 

(*j)eGoo (*j)sGoo 

E + 2E[yij]eij+4 

(*j)eGoo 


y4-2y,,,yGo,+yJo, 


2 E * [y,,,] E [Eg^J + £.,,-e [Eg^J + E [y,,,-] ^ 


(*j)sGoo 

1 




di'j')6Goi 


E 




d*M')6Goi 


+ E |‘^U«T + 2E[rG„]j^( E 

(ij)eGoo \(i',j')6Goi 


£i\j' + 


IGoil' 


E 




d*'j')6Goi 
By gathering the deterministic terms and the terms linked to the noise as in (40), we obtain


(*j’)sGoo {*j)sGoo 


o,> ^ ol^ool 

^\r< 

* ^ \ Lt( 

(*j)sGoo 


01 


E ^*' 4 ' 


+ 


IG, 


oil 


E ^*' 4 ' 


d*'d')eGoi 

+ ^ (2E \Yi^j\ Eij + Ei j ), 

(*,j)sGoo 


(*',i')6Goi 

|Goo| 

|Goi| 






E 


I 


d E 




d*'j')6Goi 


. (ij)eGoo 


where / 1 q is defined in ([^. Thus, we obtain 


= ^ (E[y,,]-E[yGoJ) - E (eEE-eEgoE^ 

+ '(M)eGoo (M)eG5o 


i^o E 


^ ' \{i,j)£Goo 0d)6G5(, 


E 


+ n(n+l)|Gl|2 I E £.-,,'1 (IGool-IGSol) 


n(n+l)|G„.l 




E m,j)- E eEm)-E(igooI-igsoI) 

,(*,i)6Goo ii,j)&G*o 


1 


n(n + l)|Goi| I,, 


E ^*' 4 ' E 


^i,j 


E 




i,(bi)sGoo (i,j)&Gt 


+ E7ETTT ^ (2E [y,,] e,, + ey - (2E [y,,] e,, + ^) 

^ ^ ^ \(bi)eGoo H.mGl, 


= B0(t) + fy0(t) + K0(t) + y„(t) 

2 


+ 


n{n + 1) 


E ( 2 ®^ + E') - E ( 2 ®^ Ed] + <i) > 


. ( 2 j')^Goo 




(42)
since $\mathbb{E}[Y_{i,j}] = \mu_0^\star = \mathbb{E}[\overline{Y}_{G_{01}}]$ for all $(i,j) \in G_{00}^\star$. Then, from (39), (41) and (42), we obtain


J^t) = B^(t) + W^{t) + V^{t) + B^(t) + <(t) + + Z„(t) 

K K* 


+ [^bi] e*,i + 4,j) - ^ X] [^bi] £*4 + 4,j) 

^ ’ \k=l(i,j)eDk fe=l(ij)eD* 




, (i,j)eGoo 


{i,j)&GQ 


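Note that
\[
\begin{aligned}
&\sum_{k=1}^{K}\sum_{(i,j)\in D_k}\left(2\,\mathbb{E}[Y_{i,j}]\,\varepsilon_{i,j}+\varepsilon_{i,j}^2\right)
+\sum_{(i,j)\in G_{00}}\left(2\,\mathbb{E}[Y_{i,j}]\,\varepsilon_{i,j}+\varepsilon_{i,j}^2\right)
+\sum_{(i,j)\in G_{01}}\left(2\,\mathbb{E}[Y_{i,j}]\,\varepsilon_{i,j}+\varepsilon_{i,j}^2\right)\\
&\quad=\sum_{k=1}^{K^\star}\sum_{(i,j)\in D_k^\star}\left(2\,\mathbb{E}[Y_{i,j}]\,\varepsilon_{i,j}+\varepsilon_{i,j}^2\right)
+\sum_{(i,j)\in G_{00}^\star}\left(2\,\mathbb{E}[Y_{i,j}]\,\varepsilon_{i,j}+\varepsilon_{i,j}^2\right)
+\sum_{(i,j)\in G_{01}}\left(2\,\mathbb{E}[Y_{i,j}]\,\varepsilon_{i,j}+\varepsilon_{i,j}^2\right),
\end{aligned}
\]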

since it amounts to summing up over all the possible indices i and j. This concludes the 
proof. 


Proof of Lemma By (20) and (22), 

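\[
\begin{aligned}
V_n(t) &= \frac{2}{n(n+1)}\left(\sum_{k=1}^{K^\star}\frac{\big(\sum_{(i,j)\in D_k^\star}\varepsilon_{i,j}\big)^2}{|D_k^\star|}
-\sum_{k=1}^{K}\frac{\big(\sum_{(i,j)\in D_k}\varepsilon_{i,j}\big)^2}{|D_k|}\right)
+\frac{2}{n(n+1)}\,\frac{\big(\sum_{(i,j)\in G_{01}}\varepsilon_{i,j}\big)^2}{|G_{01}|^2}\,\big(|G_{00}|-|G_{00}^\star|\big)\\
&= V_n^{D}(t)+V_n^{O}(t).
\end{aligned}
\]

Hence,
\[
\begin{aligned}
\mathbb{P}\Big(-\min_{t\in\mathcal{A}_{n,K}} V_n(t) > \alpha\Big)
&\le \mathbb{P}\Big(-\min_{t\in\mathcal{A}_{n,K}} V_n^{D}(t) > \frac{\alpha}{2}\Big)
+\mathbb{P}\Big(-\min_{t\in\mathcal{A}_{n,K}} V_n^{O}(t) > \frac{\alpha}{2}\Big)\\
&= \mathbb{P}\Big(\max_{t\in\mathcal{A}_{n,K}} \big(-V_n^{D}(t)\big) > \frac{\alpha}{2}\Big)
+\mathbb{P}\Big(\max_{t\in\mathcal{A}_{n,K}} \big(-V_n^{O}(t)\big) > \frac{\alpha}{2}\Big).
\end{aligned}
\tag{43}
\]

Let us now derive an upper bound for each of these two probabilities. Let us first address the first term in the rhs of (43). Note that
\[
V_n^{D}(t) \ge -\frac{2}{n(n+1)}\sum_{k=1}^{K}\frac{\big(\sum_{(i,j)\in D_k}\varepsilon_{i,j}\big)^2}{|D_k|}
\ge -\frac{2}{n(n+1)}\sum_{k=1}^{K}\max_{A\in\mathcal{I}_D}\frac{\big(\sum_{(i,j)\in A}\varepsilon_{i,j}\big)^2}{|A|}
\ge -\frac{2K}{n(n+1)}\max_{A\in\mathcal{I}_D}\frac{\big(\sum_{(i,j)\in A}\varepsilon_{i,j}\big)^2}{|A|},
\]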

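where $\mathcal{I}_D$ is the set of all possible diagonal blocks:
\[
\mathcal{I}_D = \Big\{ \big\{(i,j)\in\{1,\dots,n\}^2 \;\big|\; t\le i\le j\le t'\big\} \;\Big|\; 1\le t\le t'\le n \Big\}.
\tag{44}
\]
Thus,
\[
\mathbb{P}\Big(\max_{t\in\mathcal{A}_{n,K}} \big(-V_n^{D}(t)\big) > \frac{\alpha}{2}\Big)
\le \mathbb{P}\left(\frac{2K}{n(n+1)}\max_{A\in\mathcal{I}_D}\frac{\big(\sum_{(i,j)\in A}\varepsilon_{i,j}\big)^2}{|A|} > \frac{\alpha}{2}\right).
\]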


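Moreover, from Lemma 5 we obtain that, for all positive $u$,
\[
\mathbb{P}\left(\max_{A\in\mathcal{I}_D}\frac{\big(\sum_{(i,j)\in A}\varepsilon_{i,j}\big)^2}{|A|} > u\right)
\le \sum_{A\in\mathcal{I}_D}\mathbb{P}\left(\Big|\sum_{(i,j)\in A}\varepsilon_{i,j}\Big| > \sqrt{u|A|}\right)
\le 2\sum_{A\in\mathcal{I}_D} e^{-\frac{u}{4\beta}}
\le n(n+1)\,e^{-\frac{u}{4\beta}},
\]
where we use $|\mathcal{I}_D| \le n(n+1)/2$. Setting $u = n(n+1)\alpha/(4K)$ in the previous inequality yields
\[
\mathbb{P}\Big(\max_{t\in\mathcal{A}_{n,K}} \big(-V_n^{D}(t)\big) > \frac{\alpha}{2}\Big) \le n(n+1)\,e^{-\frac{n(n+1)\alpha}{16\beta K}}.
\tag{45}
\]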


Let us now address the second term in the rhs of (43). Note that 

2 


E -‘A (|g»i-igs„i)>-|34)^ E ^<3 


> - 


which leads to 


'E 


2 


(*j)sGoi 


(»j)gGoi 

IGoil 


a 


max - 14 ( 1 ) > - ) < 
te.4„ K 2 


E{ij)eGoi 


|Goi 


, a \ _ ItioiiE; 

> a/- <2e 8/5 , 


(46) 


where the last inequality comes from Lemma 5. The conclusion follows from (45) and (46).

Proof of Lemma Using (27), $W_n$ can be rewritten as follows:


Wn{t) = 


n{n + 1) 
4 


K* 


K 


K* 


Y. Y ^*'4' kfc'" u ^ 


fe'=o \ (i'j')eD 


k' 


fc=0 I \(ij)eDfc / £=0 


nk,e 

nk 




+ 


n{n + 1) 


E 


K* 


E^M-.s). 




(47) 


^(*j)sDo / ^=0 
where we used that /Tq = YY=o With the notation 

6fe,£ — ^ ^ ) 

the first term in the rhs of the previous equation can be rewritten as follows: 


n(n + 1) 
4 

n(n + 1) 


K* / K 

Y\Y Y ^*'4' 

k '=0 \£'=0(i',j')eD*,nD^, 


K* K 


K* 


Y Y Y 


^£'A * 

- ^^k' 

ne 


\ ^ 

Y 

] f4' -Y 


/ k=0 

A 

K K* 

ji 

Y Y 


K* 


K* 


E E 






fe=0fci=0 £=0 


* 

- 


(48) 


fc '= 0£'=0 ti =0 

where we used that X)^=o ~ We deduce from (47) and (48) that 

K K* K* 


Wnit) = 


Thus, 


4 Uk/, ^ 4 


K* 


Y Y ~ 


n(n + 1) ' ' ni, 

'• 'k=oe=oi=o 


n(n + 1) 


E 




,(*j)6Do / 1=0 


E^m-mS). 


|W„(t)| < 


n(n + 1) 


K K* K* 

EEE 

k=0 £'=0 £=0 


max 

Aei^““uio 


E 

(i'j')eA 


e^'j' 




< 


n(n + 1) 


A 


max 

A&x^'^^^yjXo 


^ 8(i^ + l)(K- + l) ^ 

“ re(n + 1) 


where $\mathcal{I}_D$ is defined in (44) and

rRir 


E 

max 

AeXp““uXo 




K K* 

EE 

k=0 £'=0 


nk 


E 




(i',j')&A 


Kmax 


k=l 


Q I {i,j) G {1,... ,n}2| 4^^ < i < 4^^ and < j < 
■ =1 

Vfe, 1 < 4^^ < 4^^ ^ ^^ < jf ^ < n + l|. 


(49)
Using Lemma 6, we obtain that


- min Wn{t) >a \ < P | ^ (-^ + 1) (-^* + 1) ^ 


teA 


l/r, 

n,K 


n{n + 1) 


max 


E 




> a 


J^max y 


^ a-r 


exp 


a^n{n + 1) 


128/3(ii: + l)^(iL* + l)^A^ 


J^max y 

'n—1 \4 


< 


■Kn 

R 


which concludes the proof by using that 

|2:^max| < < (^^„4i^a,ax 

for some positive constant $C_1$.

Proof of Lemma 1^ Using (25), we obtain 


+ \Td\ < 2 


'~r -^max 

-^R 


and 


(50) 


K* K* K* 

Y1 - ^^o) = Yl Y1 - l^o) = Yl - ^^o) = Yl - ^^o)■ 

(ij)sGoo ^=0 (ij)eGoonD| 


e=o 


e=i 


By (24), we thus obtain that 
4 


Znit) = 


1 


n(n + 1) |Gi 


on 




K* 


^(*j)sGoi 

This gives the following upper bound for $Z_n$:


^i,j I /^o) 

(uilsGpo (*,i)sGoo 


i=i 


|^n(t)| < 


n{n + 1) 


S(ij)eGoi ^*4 
IGoil 


max 

-K^max 


agx 


E ■ 


■'^,3 


+ 


n{n + 1) 


X(ij)eGoi ^*4 

I Coil 


where the collection of index sets over which the maximum is taken is defined in (49) and where we used:






E 


^i,j 


E 




(^j)£G'oo (^j')£G'oo jOsGoonGQQ^ 

$A^c$ denoting the complement of the set $A$. Thus,

X(jj)eGoi 


— min Zn(t) > a ] < 

tGAn,K 


+P 


8 

n(n + 1) 
4 


n{n + 1) 


ICoil 
X(ij)eGoi 


max 


(*d)6A 


ICoil 


> a/2 

noA > a/2 1 =: Pi + P 2 . 


We shall now provide upper bounds for $P_1$ and $P_2$.
Let us now provide an upper bound for $P_1$. Let $Z_1$ and $Z_2$ be two nonnegative random variables; then, for all positive $\alpha$ and $\gamma$,
\[
\mathbb{P}(Z_1 Z_2 > \alpha) \le \mathbb{P}(Z_1 > \gamma) + \mathbb{P}\Big(Z_2 > \frac{\alpha}{\gamma}\Big).
\]
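The splitting inequality for the product of two nonnegative random variables can be sanity-checked on simulated data. The sketch below is only an illustration: the half-normal and exponential distributions, the sample size and the name `check_product_split` are assumptions chosen for the example and do not come from the paper. Since the event $\{Z_1 Z_2 > \alpha\}$ is included in $\{Z_1 > \gamma\} \cup \{Z_2 > \alpha/\gamma\}$, the inequality holds for the empirical frequencies as well.

```python
import random


def check_product_split(alpha, gamma, n_samples=10_000, seed=0):
    """Empirical frequencies of {Z1*Z2 > alpha}, {Z1 > gamma}, {Z2 > alpha/gamma}."""
    rng = random.Random(seed)
    n_product = n_first = n_second = 0
    for _ in range(n_samples):
        z1 = abs(rng.gauss(0.0, 1.0))  # nonnegative half-normal draw (assumed law)
        z2 = rng.expovariate(1.0)      # nonnegative exponential draw (assumed law)
        n_product += z1 * z2 > alpha
        n_first += z1 > gamma
        n_second += z2 > alpha / gamma
    return n_product / n_samples, n_first / n_samples, n_second / n_samples


p_product, p_first, p_second = check_product_split(alpha=2.0, gamma=1.0)
print(p_product, "<=", p_first + p_second)
```

The parameter $\gamma$ is free, which is what allows it to be tuned later to balance the two resulting terms.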


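Applying this inequality to $P_1$ gives
\[
P_1 \le \mathbb{P}\left(\frac{\big|\sum_{(i,j)\in G_{01}}\varepsilon_{i,j}\big|}{|G_{01}|} > \gamma\right)
+\mathbb{P}\left(\frac{8}{n(n+1)}\max_{A\in\mathcal{I}_R^{K_{\max}}}\Big|\sum_{(i,j)\in A}\varepsilon_{i,j}\Big| > \frac{\alpha}{2\gamma}\right).
\]
By Lemma 5, we obtain
\[
\mathbb{P}\left(\frac{\big|\sum_{(i,j)\in G_{01}}\varepsilon_{i,j}\big|}{|G_{01}|} > \gamma\right) \le 2\,e^{-\frac{|G_{01}|\gamma^2}{4\beta}}.
\tag{52}
\]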


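Moreover, by Lemma 6, we obtain
\[
\mathbb{P}\left(\frac{8}{n(n+1)}\max_{A\in\mathcal{I}_R^{K_{\max}}}\Big|\sum_{(i,j)\in A}\varepsilon_{i,j}\Big| > \frac{\alpha}{2\gamma}\right)
\le 2\,\big|\mathcal{I}_R^{K_{\max}}\big|\,e^{-\frac{\alpha^2 n(n+1)}{512\gamma^2\beta}}
\le 2\,C_1\,n^{4K_{\max}}\,e^{-\frac{\alpha^2 n(n+1)}{512\gamma^2\beta}},
\tag{53}
\]
where we used that $|\mathcal{I}_R^{K_{\max}}| \le C_1 n^{4K_{\max}}$ (see Equation (50)).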

Let us now provide an upper bound for $P_2$. Using Lemma 5, we obtain the following upper bound:

IGqiIPP 

P 2 < 2e 64A2/J ^ ( 54 ) 

where we used that $n_0 \le n(n+1)/2$. The proof of the lemma thus follows from (51), (52), (53) and (54).

Technical lemmas. 

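Lemma 5. Let $A$ be a subset of $\{1,\dots,n\}^2$. Assume that (A1) holds; then, for all positive $\alpha$,
\[
\mathbb{P}\left(\Big|\sum_{(i,j)\in A}\varepsilon_{i,j}\Big| > \alpha\right) \le 2\,e^{-\frac{\alpha^2}{4\beta|A|}}.
\]

Proof of Lemma 5. By the Markov inequality and (A1), we get, for all positive $\eta$, that
\[
\mathbb{P}\left(\sum_{(i,j)\in A}\varepsilon_{i,j} > \alpha\right) \le \exp\left(-\eta\alpha + \beta|A|\eta^2\right).
\]
Taking $\eta = \alpha/(2\beta|A|)$ gives
\[
\mathbb{P}\left(\sum_{(i,j)\in A}\varepsilon_{i,j} > \alpha\right) \le e^{-\frac{\alpha^2}{4\beta|A|}}.
\]
This concludes the proof since the same bound holds when $\varepsilon_{i,j}$ is replaced by $-\varepsilon_{i,j}$.

□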
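The two steps of this proof can be checked numerically. The sketch below rests on assumptions that are not stated in this section: the noise is taken to be i.i.d. Gaussian $N(0,\sigma^2)$, for which $\mathbb{E}[e^{\nu\varepsilon}] = e^{\sigma^2\nu^2/2}$, so that the moment condition is read as $\mathbb{E}[e^{\nu\varepsilon}] \le e^{\beta\nu^2}$ with $\beta = \sigma^2/2$. The function names are hypothetical. The code verifies that $\eta = \alpha/(2\beta|A|)$ gives the exponent $-\alpha^2/(4\beta|A|)$ and that the exact Gaussian two-sided tail stays below the bound of the lemma.

```python
import math


def chernoff_exponent(eta, alpha, beta, size):
    """Exponent -eta*alpha + beta*|A|*eta^2 obtained after the Markov step."""
    return -eta * alpha + beta * size * eta ** 2


def lemma5_bound(alpha, beta, size):
    """Two-sided bound 2*exp(-alpha^2 / (4*beta*|A|))."""
    return 2.0 * math.exp(-alpha ** 2 / (4.0 * beta * size))


def gaussian_two_sided_tail(alpha, sigma, size):
    """Exact P(|S| > alpha) for S a sum of `size` i.i.d. N(0, sigma^2) terms."""
    return math.erfc(alpha / (sigma * math.sqrt(2.0 * size)))


# Illustrative values (assumptions): threshold, noise level and |A|.
alpha, sigma, size = 5.0, 1.0, 10
beta = sigma ** 2 / 2.0                      # Gaussian case: beta = sigma^2 / 2
eta_star = alpha / (2.0 * beta * size)       # choice of eta made in the proof
optimal_exponent = chernoff_exponent(eta_star, alpha, beta, size)
exact_tail = gaussian_two_sided_tail(alpha, sigma, size)
bound = lemma5_bound(alpha, beta, size)
print(optimal_exponent, exact_tail, bound)
```

Because the exponent is a convex quadratic in $\eta$, any other choice of $\eta$ gives a weaker bound, which is why the proof uses this particular value.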
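Lemma 6. Assume that (A1) holds; then, for all $\alpha > 0$,
\[
\mathbb{P}\left(\max_{A\in\mathcal{I}}\Big|\sum_{(i,j)\in A}\varepsilon_{i,j}\Big| > \alpha\right) \le 2\,|\mathcal{I}|\,e^{-\frac{\alpha^2}{2n(n+1)\beta}},
\]
where $A$ is any subset of $\{(i,j) : 1 \le i \le j \le n\}$ and $\mathcal{I}$ is a collection of any such subsets $A$.

Proof of Lemma 6. By Lemma 5,
\[
\mathbb{P}\left(\max_{A\in\mathcal{I}}\Big|\sum_{(i,j)\in A}\varepsilon_{i,j}\Big| > \alpha\right)
\le \sum_{A\in\mathcal{I}}\mathbb{P}\left(\Big|\sum_{(i,j)\in A}\varepsilon_{i,j}\Big| > \alpha\right)
\le 2\sum_{A\in\mathcal{I}} e^{-\frac{\alpha^2}{4\beta|A|}}
\le 2\,|\mathcal{I}|\,e^{-\frac{\alpha^2}{2n(n+1)\beta}},
\]
since $|A| \le n(n+1)/2$.

□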
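The union bound of Lemma 6 can be illustrated by simulation. The sketch below is a hedged example under assumptions that are not taken from the paper: Gaussian noise $N(0,\sigma^2)$ with $\beta = \sigma^2/2$, a collection made of the diagonal blocks $\{(i,j) : t \le i \le j \le t'\}$ with $1 \le t \le t' \le n$ (one possible reading of the block convention), and hypothetical function names. It compares the empirical frequency of the event with the bound $2|\mathcal{I}|e^{-\alpha^2/(2n(n+1)\beta)}$.

```python
import math
import random


def diagonal_blocks(n):
    """Index sets {(i, j): t <= i <= j <= t2} for 1 <= t <= t2 <= n (assumed convention)."""
    return [[(i, j) for i in range(t, t2 + 1) for j in range(i, t2 + 1)]
            for t in range(1, n + 1) for t2 in range(t, n + 1)]


def lemma6_bound(alpha, beta, n, n_sets):
    """Bound 2*|I|*exp(-alpha^2 / (2*n*(n+1)*beta))."""
    return 2.0 * n_sets * math.exp(-alpha ** 2 / (2.0 * n * (n + 1) * beta))


def empirical_max_tail(n, alpha, sigma=1.0, n_trials=2000, seed=0):
    """Frequency of {max over blocks of |block sum| > alpha} under Gaussian noise."""
    rng = random.Random(seed)
    blocks = diagonal_blocks(n)
    hits = 0
    for _ in range(n_trials):
        eps = {(i, j): rng.gauss(0.0, sigma)
               for i in range(1, n + 1) for j in range(i, n + 1)}
        largest = max(abs(sum(eps[p] for p in block)) for block in blocks)
        hits += largest > alpha
    return hits / n_trials, len(blocks)


freq, n_sets = empirical_max_tail(n=6, alpha=14.0)
bound = lemma6_bound(alpha=14.0, beta=0.5, n=6, n_sets=n_sets)
print(freq, "<=", bound)
```

The bound is deliberately crude, since every $|A|$ is replaced by its largest possible value $n(n+1)/2$, so the simulated frequency is expected to sit far below it.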